Speech pattern matching in non-white noise

ABSTRACT

A system for matching an input signal, comprising non-white noise and a patterned signal corrupted by said non-white noise, to a plurality of reference signals, the system including an estimator estimating noise features of said non-white noise and producing from the noise features at least one noise whitening filter; a filter generally simultaneously filtering the input signal and the plurality of reference signals using the noise whitening filter and producing a filtered input signal, having a white noise component, and a plurality of filtered reference signals; and a pattern matcher generally robust to white noise for matching the filtered input signal to one of the filtered reference signals.

FIELD OF THE INVENTION

The present invention relates generally to pattern matching in noisy environments and to speech pattern matching in noisy environments in particular.

DESCRIPTION OF THE PRIOR ART

Speech pattern matching is a known process in which an incoming test speech segment, such as a speech utterance, is compared to a collection of reference speech segments in order to find the reference speech segment in the collection that is most similar to the test speech segment. Similarity is defined by a score given to each reference segment with respect to the input test speech segment. The reference and test speech segments can each be represented by a set of features or by a model.

In the speech recognition task, the reference and test speech segments are uttered words and the collection of reference segments, known as templates, constitutes a pre-defined dictionary. In the speaker identification task, the reference segments are representative of voices of different people. In speech coding, such as through Vector Quantization (VQ), the test and reference segments are usually arbitrarily short segments and the VQ method represents each test segment by the index of the reference segment which is closest to it.

The matching capability of conventional algorithms deteriorates greatly in the presence of noise in the input speech. One approach to solving this problem is by speech enhancement preprocessing, a process which is reviewed in the book Speech Enhancement, edited by J. S. Lim, and published by Prentice-Hall, New-York, 1983. Application of such methods to speech recognition in a noisy environment in a car is described in the article by N. Dal Degan and C. Prati, "Acoustic Noise Analysis and Speech Enhancement Techniques for Mobile Radio Applications", Signal Processing, Vol. 15, pp. 43-56, 1988.

Some methods have been described which perform speech matching in a white noise environment for the purpose of speech recognition. Among them is the Short-time Modified Coherence (SMC) representation of speech, as described by D. Mansour and B. H. Juang, in their article, "The Short-Time Modified Coherence Representation and Noisy Speech Recognition", IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-37, pp. 795-804, June 1989.

Other methods use noise robust distortion measures, such as a projection distortion measure or a Weighted Likelihood Ratio (WLR). Methods using the projection distortion measure are discussed in the article by D. Mansour and B. H. Juang, "A Family of Distortion Measures Based Upon Projection Operation for Robust Speech Recognition", IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-37, pp. 1659-1671, November 1989.

Unfortunately, the abovementioned methods fail when the noise is colored, as in the environment of a driving car.

Y. Ephraim, J. G. Wilpon and L. R. Rabiner, in the article, "A Linear Predictive Front-end Processor for Speech Recognition in Noisy Environments", International Conference on Acoustics, Speech and Signal Processing, ICASSP-87, pp. 1324-1327, Dallas, Tex., 1987, present a method for speech recognition suitable for colored noise. In this method, the power spectrum of the noise is used, in an iterative algorithm, to estimate the Linear Prediction Coefficients (LPC) of clean speech from its noisy version. This algorithm requires extensive computations.

This last method and the SMC method were applied to speech recognition in car noise by I. Lecomte, M. Lever, J. Boudy and A. Tassy. Their results are discussed in the article, "Car Noise Processing for Speech Input", International Conference on Acoustics, Speech and Signal Processing, ICASSP-89, pp. 512-515, Glasgow, UK, 1989.

SUMMARY OF THE INVENTION

It is an object of the present invention to overcome the problems of the prior art and to provide a pattern matching system which is operative in the presence of colored and quasi-stationary noise and which is computationally efficient. The present invention may be used, for example, for speech recognition, speaker identification and verification, or vector quantization (VQ) for speech coding.

The system includes the following three operations:

1) Noise modeling: Noise is collected from a noisy input test signal in the intervals containing no speech. The features of the noise are extracted and are used to construct a noise whitening filter for whitening the noise.

2) Pre-Processing: The noisy input test signal and a plurality of reference template signals, each containing a previously stored reference speech signal which can be accompanied by noise, are filtered through the noise whitening filter. This produces modified test and reference signals wherein the noise component of the test signal is white. It also ensures that the test and reference template signals are modified in identical ways.

3) Matching: A pattern matching algorithm which is operative in white noise is applied to the modified test and reference template signals. This operation involves scoring the similarity between the modified test signal and each of the modified reference templates, followed by deciding which reference template is most similar to the test signal.

The present invention can be applied, among others, to the following problems of speech processing in colored noise:

a) Speech recognition using Dynamic Time Warping (DTW) in the presence of colored noise, such as is found in the environment of a car.

b) Vector quantization (VQ) of noisy speech.

c) Speech recognition using Hidden Markov Models (HMM) with a discrete or semi-continuous probability distribution and using VQ.

d) Speaker identification or verification using DTW or HMM.

e) Speech compression through Vector Quantization of the LPC coefficients or other characteristic features of the test utterance.

In accordance with the present invention there is provided a system for matching an input signal, including non-white noise and a patterned signal corrupted by the non-white noise, to a plurality of reference signals, the system including means for estimating noise features of the non-white noise and for producing from the features at least one noise whitening filter, filter means for filtering the input signal and the plurality of reference signals using the at least one noise whitening filter and producing a filtered input signal, having a white noise component, and a plurality of filtered reference signals and pattern matching means generally robust to white noise for matching the filtered input signal to one of the filtered reference signals. In the system thus provided the at least one noise whitening filter is two noise whitening filters respectively for filtering the input signal and the reference signals which system also includes means for extracting features of the input signal and the reference signals and wherein the filter means operate in a feature domain. A feature domain of the input signal is different than a feature domain of the reference signals. The pattern matching means perform a pattern matching technique selected from the group of DTW, HMM, or DTW-VQ. The input signal is a speech signal.

In the system thus provided, the feature domains are selected from the group of data samples, Linear Prediction Coefficients, cepstral coefficients, power spectrum samples, and filter bank energies and the means for estimating estimate a filter in accordance with the selected feature domain.

Also in accordance with the present invention there is provided a speech recognition system for recognizing words found in a speech signal corrupted by non-white noise including means for estimating noise features of the non-white noise and for producing from the features at least one noise whitening filter, filter means for filtering the speech signal and a plurality of reference signals of selected spoken words, the filter means using the at least one noise whitening filter and producing a filtered speech signal and a plurality of filtered reference signals and pattern matching means generally robust to white noise for matching the filtered speech signal to one of the filtered reference signals thereby recognizing the word in the speech signal.

Further provided in accordance with the present invention is a Vector Quantization (VQ) system for vector quantizing a speech signal corrupted by non-white noise into a sequence of symbols, the system including means for estimating noise features of the non-white noise and for producing from the features at least one noise whitening filter, filter means for filtering segments of the speech signal and a plurality of numbered reference signals of selected speech segments, the filter means using the at least one noise whitening filter and producing filtered speech segments and a plurality of filtered reference segments and pattern matching means generally robust to white noise for matching each of the filtered speech segments to one of the filtered reference segments and for providing as an output a symbol which is an index of the matched reference segment.

In yet another embodiment of the present invention there is provided a speech recognition system for recognizing a word found in a speech signal corrupted by non-white noise including a vector quantization system according to claim 10 producing vector quantized speech and word matching means receiving a plurality of reference sequences of symbols relating to the reference signals and a test sequence of symbols relating to the speech signal for matching the test sequence to the reference sequences thereby to recognize the word in the speech signal. The word matching means performs either Dynamic Time Warping (DTW) on the vector quantized speech or Hidden Markov Modeling (HMM).

In accordance with a further embodiment of the present invention there is provided a speaker recognition system using any of the systems described above wherein the reference signals include one word spoken by a plurality of different speakers.

Additionally in accordance with an embodiment of the present invention there is provided a speaker verification system using any of the systems described above wherein the reference signals include at least one word spoken by one speaker.

The non-white noise is the noise from the environment of a movable vehicle or, alternatively, the noise from the environment of a moving airplane cockpit or a vibrating machine.

In accordance with the present invention there is described a method for matching an input signal employing the system as described above.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be understood and appreciated more fully from the following detailed description taken in conjunction with the drawings in which:

FIG. 1 is a schematic block diagram illustration of a pattern matching system constructed and operated in accordance with a preferred embodiment of the present invention; and

FIG. 2 is a schematic block diagram illustration of the hardware implementing the pattern matching system of FIG. 1.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Reference is now made to FIG. 1 which illustrates a schematic block diagram of a pattern matching system constructed and operative in accordance with the principles of the present invention.

The following discussion will present an embodiment of the present invention for matching patterned signals which are speech signals. It will be understood that this is for clarity of discussion only; the present invention is operative for all types of patterned signals accompanied by colored noise.

The pattern matching system of the present invention typically includes an input device 10, such as a microphone or similar device, for providing an analog signal, and a sampling device 12 for converting the analog signal to a digital signal. The samples of the digital signal are typically grouped into frames, typically of 128 or 256 samples each.

The digital and analog signals typically include portions containing only background noise which is typically non-white, such as colored or quasi-stationary noise, and some portions containing a signal whose pattern is to be detected, known herein as a "patterned signal". In the case of speech systems, the patterned signal is the speech signal.

The patterned signal is typically corrupted by the colored noise. The present invention seeks to match the patterned signal to a plurality of previously stored reference signals wherein the patterned signal is received in the presence of the colored noise.

The reference signals are stored as reference templates including feature sets of the reference signals extracted via feature extraction devices not shown. The reference templates are typically stored in a reference template storage device 14, such as any suitable memory device, during a process called training (not shown). These templates are representative of various patterned signals to which it is desired to match the input patterned signal. For example, the reference templates might be feature sets of uttered words (for speech recognition) or of utterances of various speakers (for speaker recognition or verification). The reference templates might also be centroids of speech segments (for speech coding using VQ analysis).

The digital signal is supplied to a patterned signal activated detection device 16 which generally detects the presence or absence of the patterned signal. For speech signals, the device 16 typically is a voice activated switch (VOX) such as described in U.S. Pat. No. 4,959,865 to Stettiner et al., which is incorporated herein by reference. The outputs of the device 16 are two signals: a noise signal and a "test utterance" including the patterned signal corrupted by colored noise. For the present invention, the VOX typically does not have to be precise.

The remainder of the present invention will be described for speech signals, as an example only. It will be appreciated that the present invention is operative for other types of patterned signals also.

The noise signal is provided to a noise filter estimator 18, described in more detail hereinbelow, for estimating parameters of a noise whitening filter. The noise whitening filter can convert the colored noise signal into a white noise signal.

The noise whitening filter thus estimated is used to filter both the test utterance and the reference templates, as described in more detail hereinbelow.

The test utterance is provided to a first stage feature extraction device 24 which transforms the test utterance into a sequence of test feature parameter vectors which can be any of several types of desired features, such as power spectrum samples, autocorrelation coefficients, LPC, cepstral coefficients, filter bank energies or other features characteristic of the power spectrum of the test utterance.

Suitable feature extraction devices 24 are described in the book, Speech Communication: Human and Machine by Douglas O'Shaughnessy, published by Addison-Wesley of Reading, Mass. in 1987, which book is incorporated herein by reference.

It will be noted that there is one feature vector per frame of the test utterance and one feature vector per frame of each reference template.

The feature type is chosen by a system designer and typically depends on the pattern matching task required, the speed necessary for the task and the hardware available to perform the task. Since all of the abovementioned feature types contain basically the same information, any of them can be utilized.

For speech or speaker recognition tasks, each feature vector preferably contains the features of one speech frame of approximately 30 msec. An overlap of typically 50% may be applied between adjacent speech frames.
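
The framing described above can be sketched as follows. This is an illustrative sketch only: the 8 kHz sampling rate, the function name split_into_frames and the choice to drop a trailing partial frame are assumptions made for the example and are not taken from the specification.

```python
def split_into_frames(samples, frame_len, overlap=0.5):
    """Split a sample sequence into overlapping frames.

    A trailing partial frame is dropped (an assumption of this sketch).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Hop size is the distance between the starts of adjacent frames.
    hop = max(1, int(round(frame_len * (1.0 - overlap))))
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


# Assumed sampling rate of 8 kHz: a 30 msec frame then holds 240 samples.
sample_rate_hz = 8000
frame_len = 30 * sample_rate_hz // 1000
frames = split_into_frames(list(range(1000)), frame_len, overlap=0.5)
```

With a 50% overlap each sample, apart from those at the edges, appears in two frames, so one feature vector is produced every 15 msec under these assumptions.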

The test vector is provided to a noise whitening filter 26, whose parameters are estimated by filter estimator 18, for filtering the test vector so as to provide a filtered test vector in the presence of approximately white, rather than colored, noise. In this manner, known methods of matching test vectors to reference templates, which are operative only for vectors corrupted by white noise, can be used.

As is known in the art, filters affect all of the vector being filtered and not just the components of the vector which it is desired to be filtered. Thus, the output of the noise whitening filter is a test vector in the presence of white noise whose speech component is different than that of the original test vector.

Therefore, in accordance with a preferred embodiment of the present invention and in order to preserve the matching between the test vector and the reference templates, the entirety of reference templates from the reference template storage device 14 are filtered by a noise whitening filter 28 which is generally identical to noise whitening filter 26. In this manner, the reference templates to which the test vector is to be matched are adjusted in the same manner as the test vector.

The reference templates are typically defined in the same feature set as the test vector. If so, noise whitening filter 28 is identical to noise whitening filter 26. If not, filter 28 is defined differently than the filter 26 although both filters have an equivalent effect.

The noise whitening filters 26 and 28 are calculated as follows. The parameters of each filter are such that the power spectrum of its impulse response is approximately the inverse of the power spectrum of the colored noise, as estimated from the most recent noise portions received from the patterned signal detection device 16.

In accordance with a preferred embodiment of the present invention, the noise whitening filters 26 and 28 are defined with respect to the same feature sets that respectively describe the test utterance and the reference signals. Various ways to estimate and operate the filters 26 and 28 exist and depend on the type of feature set used.

For feature sets which contain the samples of the test utterance: Filter estimator 18 estimates an Infinite Impulse Response (IIR) or a Finite Impulse Response (FIR) filter. The latter is typically a moving average filter whose coefficients are estimated by LPC analysis of the noise signal. The IIR or FIR is then applied to the samples of the test utterance.
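
A minimal sketch of the FIR case is given below, assuming that the LPC analysis is carried out with the autocorrelation method and the Levinson-Durbin recursion, which is one common way of performing LPC analysis and is not stated in the specification. The function names, the filter order of 2 and the synthetic first-order colored noise are all assumptions made for illustration.

```python
import random


def autocorrelation(x, max_lag):
    """Autocorrelation sums of x for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return the prediction-error (whitening) filter [1, a1, ..., ap]."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input, keep the coefficients found so far
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err  # reflection coefficient of stage m
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + k_m * prev[m - k]
        a[m] = k_m
        err *= 1.0 - k_m * k_m
    return a


def fir_filter(coeffs, x):
    """Apply a moving average (FIR) filter to the samples x."""
    return [sum(c * x[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
            for n in range(len(x))]


# Synthetic colored noise (assumption): x[n] = 0.9 * x[n-1] + white noise.
random.seed(0)
noise = [0.0]
for _ in range(3999):
    noise.append(0.9 * noise[-1] + random.gauss(0.0, 1.0))

# Estimate the whitening filter from the noise-only portion, then apply it.
# The same coefficients would be applied to the test utterance samples.
whitener = levinson_durbin(autocorrelation(noise, 2), 2)
whitened = fir_filter(whitener, noise)
```

For this synthetic noise the estimated filter should come out close to [1, -0.9, 0], and the filtered noise should show very little correlation between adjacent samples.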

For feature sets which contain power spectrum samples or filter bank energy samples: Filter estimator 18 estimates the inverse of the average power spectrum of the noise signal. The filter operates by multiplying the test utterance power spectrum or filter bank energy samples by the corresponding filter power spectrum values.
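
The power spectrum case can be sketched as a per-bin multiplication. The function names, the small floor used to avoid division by zero and the toy three-bin spectra below are assumptions for illustration only.

```python
def inverse_noise_spectrum(noise_frames_power, floor=1e-10):
    """Inverse of the average noise power spectrum, one value per bin."""
    n_frames = len(noise_frames_power)
    n_bins = len(noise_frames_power[0])
    average = [sum(frame[b] for frame in noise_frames_power) / n_frames
               for b in range(n_bins)]
    # The floor (an assumption) guards against empty bins.
    return [1.0 / max(p, floor) for p in average]


def whiten_power_spectrum(frame_power, inverse_noise):
    """Multiply each bin by the corresponding filter value."""
    return [p * w for p, w in zip(frame_power, inverse_noise)]


# Toy data: two noise-only frames with a falling (colored) spectrum.
noise_frames = [[4.0, 1.0, 0.25], [4.0, 1.0, 0.25]]
inverse_noise = inverse_noise_spectrum(noise_frames)

# The same filter is applied to the test frame and to a reference frame.
test_frame = [6.0, 3.0, 0.5]
reference_frame = [2.0, 2.0, 0.25]
filtered_test = whiten_power_spectrum(test_frame, inverse_noise)
filtered_reference = whiten_power_spectrum(reference_frame, inverse_noise)
flattened_noise = whiten_power_spectrum(noise_frames[0], inverse_noise)
```

Applying the filter to the noise itself yields a flat spectrum, which is the sense in which the noise is whitened.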

For feature sets which contain autocorrelation coefficients: Filter estimator 18 estimates the inverse of the average power spectrum of the noise signal and converts it to the correlation domain. The autocorrelation coefficients of the test utterance are then convolved with the filter coefficients.
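
Multiplication in the power spectrum domain corresponds to convolution in the correlation domain, which is why the filter is applied here by convolution. The sketch below assumes that both sequences are symmetric about lag zero, that only non-negative lags are stored, and that lags beyond those stored are treated as zero. The function name and the numeric values are hypothetical, and the conversion of the inverse noise spectrum to the correlation domain is not shown.

```python
def convolve_autocorrelation(r, h):
    """Convolve two symmetric sequences given by their non-negative lags.

    r holds lags 0..p of the utterance autocorrelation and h holds
    lags 0..q of the filter in the correlation domain. Lags of r
    beyond p are treated as zero (an assumption of this sketch).
    Returns lags 0..p of the filtered autocorrelation.
    """
    p, q = len(r) - 1, len(h) - 1

    def r_at(lag):
        lag = abs(lag)  # autocorrelation is symmetric in the lag
        return r[lag] if lag <= p else 0.0

    return [sum(h[abs(j)] * r_at(k - j) for j in range(-q, q + 1))
            for k in range(p + 1)]


# Hypothetical values: a three-lag autocorrelation and a two-lag filter.
utterance_autocorrelation = [2.0, 1.0, 0.5]
filter_correlation = [1.0, -0.5]
filtered = convolve_autocorrelation(utterance_autocorrelation,
                                    filter_correlation)
```

A filter consisting of a single unit value at lag zero leaves the autocorrelation unchanged, which is a convenient check on the indexing.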

For feature sets which contain cepstral coefficients: The filter estimator 18 estimates the cepstral coefficients of the noise signal. The cepstral coefficients of the noise are then subtracted from the corresponding cepstral coefficients of the test utterance. No subtraction is performed on the zeroth coefficient of the test utterance.
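
Because the cepstrum is derived from the logarithm of the spectrum, multiplying by the inverse noise spectrum becomes a subtraction of cepstral coefficients. A minimal sketch follows. The function names and numeric values are hypothetical, and averaging the noise cepstrum over several noise frames is an assumption of this sketch.

```python
def average_cepstrum(noise_cepstra):
    """Average the cepstral vectors of several noise-only frames."""
    n_frames = len(noise_cepstra)
    return [sum(c[i] for c in noise_cepstra) / n_frames
            for i in range(len(noise_cepstra[0]))]


def whiten_cepstrum(cepstrum, noise_cepstrum):
    """Subtract the noise cepstrum, leaving the zeroth coefficient as is."""
    return [cepstrum[0]] + [c - n for c, n in
                            zip(cepstrum[1:], noise_cepstrum[1:])]


# Hypothetical cepstral vectors (index 0 is the zeroth coefficient).
noise_cepstrum = average_cepstrum([[3.0, 0.5, 0.0], [3.0, 0.3, 0.2]])
test_vector = [5.0, 1.0, -0.5]
filtered_test = whiten_cepstrum(test_vector, noise_cepstrum)
```

The same subtraction would be applied to each reference template vector so that test and reference vectors are modified identically.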

The filtered test and reference feature vectors are then passed separately through second stage feature extractor devices 30 and 32 respectively, which are operative to transform the filtered feature vectors to feature vectors which are appropriate for the chosen pattern matching method, as described hereinbelow.

It will be appreciated that the first and second stage feature extraction devices are chosen together to produce the features necessary for the selected pattern matching method. The two stages are necessary to enable the filter estimation to be performed with whichever feature type a designer desires, whether for reasons of computation ease or speed.

The second stage feature extraction devices 30 and 32 (one or both of them) can be absent if the respective input feature vectors are already suitable for the selected pattern matching method. The first stage feature extraction device 24 can be absent (provided the second stage feature extraction device 30 exists). In that case, the test vector includes speech samples.

The filtered test vectors of the test utterance and the filtered reference vectors of the entirety of reference templates are passed to a local scoring or matching unit 34, operative to calculate a score between the filtered test feature vectors and each of the corresponding filtered reference feature vectors. For speech or speaker recognition tasks, the unit 34 also receives data from a boundary detector 35 which indicates the beginning and ending points of speech in the test utterance. Any frames, or vectors, of the test utterance which are outside of the beginning and ending points of speech will not be utilized in the scoring of unit 34.

The boundary detector 35 receives the test utterance from the patterned signal detection device 16 and determines the beginning and ending points usually via inspection of the energy contained in the patterned signal. Suitable boundary detectors 35 are described in the following article, which is incorporated herein by reference:

L. Lamel, L. Rabiner, A. Rosenberg and J. Wilpon, "An Improved Endpoint Detector for Isolated Word Recognition," IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-29, pp. 777-785, 1981.

The local scoring unit 34 typically uses a local distortion measure which is robust to white noise. Example local distortion measures are WLR and projection distortion measures as described in the previously mentioned article by D. Mansour and B. H. Juang, which article is incorporated herein by reference.

The output of unit 34 is a set of local similarity scores where each score indicates the similarity between a frame of the test utterance and single frames of each one of the reference templates.

The set of local scores is then provided to a decision procedure 36, described hereinbelow, for determining the index, code or symbol of the reference template which the test utterance best matches. These indices, codes or symbols are the overall output of the matching procedure.

For speech or speaker recognition systems, the decision is global in the sense that it is based on the local scores of many test feature vectors or frames. This is necessary so as to match a number of frames which make up an uttered word. Thus, the global scoring is typically carried out using a standard Dynamic Time Warping (DTW) procedure on the local scores, described in the following article, incorporated herein by reference:

H. Sakoe and S. Chiba, "Dynamic Programming Optimization for Spoken Word Recognition", IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-26, pp. 43-49, February 1978.
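
The global decision by DTW can be sketched as follows. This is a simplified illustration and not the procedure of the cited article: it uses the basic three-way step pattern without slope constraints or path length normalization, and it uses a squared Euclidean distance as a stand-in for the noise robust local distortion measures discussed above. All function names and the toy templates are assumptions.

```python
def squared_euclidean(u, v):
    """Stand-in local distortion (assumption), not WLR or projection."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def dtw_score(test, reference, local_dist):
    """Accumulated distortion along the best time alignment."""
    n, m = len(test), len(reference)
    inf = float("inf")
    # d[i][j] is the best accumulated cost of aligning the first i test
    # frames with the first j reference frames.
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = local_dist(test[i - 1], reference[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def recognize(test, templates, local_dist=squared_euclidean):
    """Return the name of the template with the lowest global score."""
    return min(templates,
               key=lambda name: dtw_score(test, templates[name], local_dist))


# Toy one-dimensional feature vectors for two hypothetical words.
templates = {"yes": [[0.0], [1.0], [2.0]], "no": [[5.0], [5.0], [5.0]]}
test_utterance = [[0.0], [0.0], [1.0], [2.0], [2.0]]
best = recognize(test_utterance, templates)
```

In the system described, both the test vectors and the template vectors passed to such a procedure would already have been filtered by the noise whitening filters.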

For Vector Quantization (VQ) methods, the decision is local in the sense that for each filtered test vector of the test utterance, the filtered reference feature vector which best matches it is provided. By "best match" it is meant that the local distortion between the filtered test vector and the filtered reference feature vector is minimal. The output of the match is the index of the best matched reference vector.
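
The local decision for VQ amounts to a nearest neighbor search over the filtered reference vectors. In the sketch below the squared Euclidean distance is a stand-in for the local distortion measure, and the function names and toy vectors are assumptions for illustration.

```python
def squared_euclidean(u, v):
    """Stand-in local distortion (assumption)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quantize(vectors, reference_vectors, local_dist=squared_euclidean):
    """Map each vector to the index of its best matched reference vector."""
    return [min(range(len(reference_vectors)),
                key=lambda i: local_dist(v, reference_vectors[i]))
            for v in vectors]


# Toy filtered reference vectors and filtered test vectors.
reference_vectors = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
test_vectors = [[0.1, 0.2], [3.5, 4.2], [0.9, 1.3]]
symbols = quantize(test_vectors, reference_vectors)
```

The resulting sequence of indices is the symbol sequence that later stages, such as DTW-VQ or HMM scoring, would operate on.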

It will be appreciated that for VQ methods, the collection of reference templates is known in the art as a "codebook".

For speech recognition, the process can also be performed via Hidden Markov Modeling (HMM) or DTW-VQ in two stages. The first stage is the VQ method described above operating on the test and reference vectors and providing symbols representing the test and reference vectors. The second stage is a scoring stage. For DTW-VQ methods, the local scoring is between symbols. A global score, providing a score for a group of reference symbols forming a reference word, is then calculated on the local scores via DTW.

For scoring via HMM, a model of a word is first built from the symbols and the global score is then calculated between models using the Viterbi algorithm or the forward-backward algorithm. Both algorithms are described in the article by L. R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", Proceedings of the IEEE, Vol. 77, pp. 257-285, February 1989, which article is incorporated herein by reference.

For recognition of continuous or connected speech the output of unit 34 is a series of indices of the reference words recognized in the input speech.

Reference is now made to FIG. 2 which shows a schematic block diagram of the architecture implementing the system of FIG. 1.

A user codec 40, such as an Intel 2913, from Intel Corporation, receives the analog signal from the input device 10 and interfaces with digital signal processing circuitry 42, typically a TMS 320C25 from Texas Instruments Corporation.

A memory storage area 44, which typically includes a static random-access memory such as a 32K by 8 bit memory with an access time of 100 nsec, is connected to the digital signal processing circuitry 42 by means of a standard address data and read-write control bus.

The operations of FIG. 1 are typically carried out by software run on the digital signal processing circuitry 42. The VOX of unit 16 is typically incorporated in software run on the digital signal processing circuitry 42.

It will be appreciated by persons skilled in the art that the present invention is not limited to what has been particularly shown and described hereinabove. Rather, the scope of the present invention is defined only by the claims that follow:

We claim:
 1. A system for matching an input signal, comprising non-white noise and a patterned signal corrupted by said non-white noise, to a plurality of reference signals, said system comprising: an estimator estimating noise features of said non-white noise and producing from said noise features at least one noise whitening filter; a filter filtering said input signal and said plurality of reference signals using said at least one noise whitening filter and producing a filtered input signal, having a white noise component, and a plurality of filtered reference signals; and a pattern matcher for matching said filtered input signal to one of said filtered reference signals.
 2. System according to claim 1 and wherein said at least one noise whitening filter is two noise whitening filters respectively for filtering the input signal and the reference signals.
 3. System according to claim 1 including an extractor extracting features of said input signal and said reference signals and wherein said filter operates in a feature domain.
 4. System according to claim 3 and wherein said feature domains are selected from the group of data samples, Linear Prediction Coefficients, cepstral coefficients, power spectrum samples, and filter bank energies.
 5. System according to claim 4 and wherein said estimator estimates a filter in accordance with the selected feature domain.
 6. System according to claim 1 and wherein a feature domain of the input signal is different than a feature domain of the reference signals.
 7. System according to claim 6 and wherein said feature domains are selected from the group of data samples, Linear Prediction Coefficients, cepstral coefficients, power spectrum samples, and filter bank energies.
 8. System according to claim 1 and wherein the pattern matcher performs a pattern matching technique selected from the group of Dynamic Time Warping (DTW), Hidden Markov Models (HMM), or Dynamic Time Warping-Vector Quantization (DTW-VQ).
 9. System according to claim 1 and wherein the input signal is a speech signal.
 10. System according to claim 1 and wherein said non-white noise is the noise from the environment of a movable vehicle.
 11. System according to claim 1 and wherein said non-white noise is the noise from the environment of a moving airplane cockpit.
 12. System according to claim 1 and wherein said non-white noise is the noise from the environment of a vibrating machine.
 13. Speech recognition system for recognizing words found in a speech signal corrupted by non-white noise, said system comprising: an estimator estimating noise features of said non-white noise and for producing from said noise features at least one noise whitening filter; a filter filtering said speech signal and a plurality of reference signals of selected spoken words, said filter using said at least one noise whitening filter and producing a filtered speech signal and a plurality of filtered reference signals; and a pattern matcher for matching said filtered speech signal to one of said filtered reference signals thereby recognizing the word in said speech signal.
 14. Speech recognition system for recognizing a word found in a speech signal corrupted by non-white noise, said system comprising: a vector quantization system according to claim 10 producing vector quantized speech; and a word matcher receiving a plurality of reference sequences of symbols relating to said reference signals and a test sequence of symbols relating to said speech signal for matching said test sequence to said reference sequences thereby to recognize said word in said speech signal.
 15. A speech recognition system according to claim 14 wherein said word matcher performs Dynamic Time Warping (DTW) on said vector quantized speech.
 16. A speaker verification system using the system of claim 15 wherein said reference signals comprise at least one word spoken by one speaker.
 17. A speech recognition system according to claim 14 wherein said word matcher performs Hidden Markov Modeling (HMM).
 18. A speaker verification system using the system of claim 17 wherein said reference signals comprise at least one word spoken by one speaker.
 19. A speaker recognition system using the system of claim 14 wherein said reference signals comprise one word spoken by a plurality of different speakers.
 20. A speaker verification system using the system of claim 14 wherein said reference signals comprise at least one word spoken by one speaker.
 21. A speaker recognition system using the system of claim 13 wherein said reference signals comprise one word spoken by a plurality of different speakers.
 22. A speaker verification system using the system of claim 13 wherein said reference signals comprise at least one word spoken by one speaker.
 23. A Vector Quantization (VQ) system for vector quantizing a speech signal corrupted by non-white noise into a sequence of symbols, said system comprising: an estimator estimating noise features of said non-white noise and for producing from said noise features at least one noise whitening filter; a filter filtering segments of said speech signal and a plurality of numbered reference signals of selected speech segments, said filter using said at least one noise whitening filter and producing filtered speech segments and a plurality of filtered reference segments; and a pattern matcher for matching each of said filtered speech segments to one of said filtered reference segments and for providing as an output a symbol which is an index of the matched reference segment.
 24. A speaker recognition system using the system of claim 23 wherein said reference signals comprise one word spoken by a plurality of different speakers.
 25. A speaker verification system using the system of claim 23 wherein said reference signals comprise at least one word spoken by one speaker.
 26. A method for matching an input signal, comprising non-white noise and a patterned signal corrupted by said non-white noise, to a plurality of reference signals, said method comprising the steps of: estimating noise features of said non-white noise and producing from said noise features at least one noise whitening filter; filtering said input signal and said plurality of reference signals using said at least one noise whitening filter and producing a filtered input signal, having a white noise component, and a plurality of filtered reference signals; and matching said filtered input signal to one of said filtered reference signals.
 27. A method according to claim 26 and wherein said at least one noise whitening filter is two noise whitening filters respectively for filtering the input signal and the reference signals.
 28. A method according to claim 26 including the steps of extracting features of said input signal and said reference signals and wherein said step of filtering occurs in a feature domain.
 29. A method according to claim 28 and wherein said feature domains are selected from the group of data samples, Linear Prediction Coefficients, cepstral coefficients, power spectrum samples, and filter bank energies.
 30. A method according to claim 28 and wherein said step of estimating estimates a filter in accordance with the selected feature domain.
 31. A method according to claim 26 and wherein a feature domain of the input signal is different than a feature domain of the reference signals.
 32. A method according to claim 26 and wherein said step of matching performs a pattern matching technique selected from the group of Dynamic Time Warping (DTW), Hidden Markov Models (HMM), or Dynamic Time Warping-Vector Quantization (DTW-VQ).
 33. A method according to claim 26 and wherein the input signal is a speech signal.